Bioinformatics A Practical Guide to Next Generation Sequencing Data Analysis (Hamid D. Ismail)


qiime tools view diversity-indices/faith-pd-group-significance.qzv

qiime diversity alpha-group-significance \
  --i-alpha-diversity diversity-indices/shannon_vector.qza \
  --m-metadata-file data/sample-metadata.tsv \
  --o-visualization diversity-indices/shannon-group-significance.qzv

qiime tools view diversity-indices/shannon-group-significance.qzv

These commands run Kruskal–Wallis tests (a non-parametric analog of one-way analysis of variance), both across all groups and pairwise. The visualization files show boxplots and test statistics for each metadata grouping.
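Since the same test is run once per alpha diversity vector, the calls above can be wrapped in a small helper. The following is a minimal sketch, not part of the original workflow: the function name `alpha_significance` is hypothetical, and it assumes each vector is stored as `diversity-indices/<metric>_vector.qza`, as with `shannon_vector.qza`. The Faith PD vector name `faith_pd_vector.qza` is likewise an assumption inferred from the visualization name used above.

```shell
# Hypothetical helper: run alpha-group-significance for one alpha diversity metric.
# Assumes the input vector is named diversity-indices/<metric>_vector.qza.
alpha_significance() {
    metric="$1"
    # Output names follow the pattern used above: underscores become hyphens.
    label=$(printf '%s' "$metric" | tr '_' '-')
    qiime diversity alpha-group-significance \
        --i-alpha-diversity "diversity-indices/${metric}_vector.qza" \
        --m-metadata-file data/sample-metadata.tsv \
        --o-visualization "diversity-indices/${label}-group-significance.qzv"
}
```

For example, `for m in faith_pd shannon; do alpha_significance "$m"; done` would regenerate both visualizations, provided both vectors exist under those names.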

We will analyze sample composition (beta diversity group distances) in the context of categorical metadata using PERMANOVA. Note: The qiime diversity beta-group-significance command tests only one metadata column at a time, so to test the differences between groups we have to indicate the appropriate column name from the metadata file. In addition, if we call this command with the --p-pairwise parameter, it will perform pairwise tests that allow us to determine which specific pairs of groups differ from one another in community composition. We will apply PERMANOVA to test for significant differences in weighted UniFrac distances between the groups of samples.

qiime diversity beta-group-significance \
  --i-distance-matrix diversity-indices/weighted_unifrac_distance_matrix.qza \
  --m-metadata-file data/sample-metadata.tsv \
  --m-metadata-column group \
  --o-visualization diversity-indices/weighted-unifrac-life-stage-significance.qzv \
  --p-pairwise

qiime tools view diversity-indices/weighted-unifrac-life-stage-significance.qzv
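Because the command tests one metadata column per call, comparing several columns means repeating it. The following is a minimal sketch under stated assumptions: the function name `beta_significance` and the output naming scheme are hypothetical, and `--p-method` is used on the assumption that the installed q2-diversity version offers `permanova` (its default) and `permdisp`, the latter testing differences in group dispersion rather than in group location.

```shell
# Hypothetical helper: test one metadata column with the chosen method.
# $1 = metadata column name, $2 = test method (defaults to permanova).
beta_significance() {
    column="$1"
    method="${2:-permanova}"
    qiime diversity beta-group-significance \
        --i-distance-matrix diversity-indices/weighted_unifrac_distance_matrix.qza \
        --m-metadata-file data/sample-metadata.tsv \
        --m-metadata-column "$column" \
        --p-method "$method" \
        --p-pairwise \
        --o-visualization "diversity-indices/weighted-unifrac-${column}-${method}.qzv"
}
```

Calling `beta_significance group` repeats the PERMANOVA test under a different output name, while `beta_significance group permdisp` can help judge whether a significant PERMANOVA result might reflect unequal dispersion among groups.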

Finally, we will use the Emperor tool to explore the microbial community composition using principal coordinate analysis (PCoA) plots in the context of sample metadata.

qiime emperor plot \
  --i-pcoa diversity-indices/weighted_unifrac_pcoa_results.qza \
  --m-metadata-file data/sample-metadata.tsv \
  --o-visualization diversity-indices/weighted-unifrac-emperor-life-stage.qzv

qiime tools view diversity-indices/weighted-unifrac-emperor-life-stage.qzv
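The `qiime tools view` command opens the visualization in a web browser, which may not be available on a remote server. As a sketch of one alternative, `qiime tools export` unpacks a visualization into a directory that can be copied to another machine and opened from its `index.html` file. The function name `export_qzv` and the `exported/` output directory are hypothetical.

```shell
# Hypothetical helper: unpack a .qzv file into exported/<name>/ for offline viewing.
export_qzv() {
    qzv="$1"
    # Strip the directory and the .qzv extension to name the output folder.
    name=$(basename "$qzv" .qzv)
    qiime tools export \
        --input-path "$qzv" \
        --output-path "exported/${name}"
}
```

For instance, `export_qzv diversity-indices/shannon-group-significance.qzv` would write the unpacked files to `exported/shannon-group-significance/`.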

7.4 SUMMARY

Amplicon-based sequencing targets a specific marker gene that can distinguish species. Hence, it is used to identify the species in samples that contain multiple microbes, such as environmental and clinical samples. The 16S rRNA gene is usually targeted in the